ABSTRACT

It is of general interest to find criteria for a matrix to be positive
(or negative) semidefinite. The usual characterization of semidefinite
matrices in terms of their principal minors can be rather laborious to
implement practically. We present here an elementary proof of a known
alternate characterization of a semidefinite matrix in terms of its
null-space and of its largest characteristic value. An iterative
procedure is also suggested which may be useful in deciding the
semidefiniteness of a matrix.


A NOTE ON SEMIDEFINITE MATRICES

In what follows $A$ will always represent a real, symmetric, $n \times n$
matrix. If, for each $x \in R^n$ (*), it is true that $xAx^T \ge 0$ (**),
then we say that $A$ is positive semidefinite, denoted p.s.d.; if
$(xAx^T)(yAy^T) \ge 0$ for all $x, y \in R^n$ we say that $A$ is
semidefinite, denoted s.d. We first prove the following:

THEOREM 1. The following are equivalent:

(i)   $A$ is s.d.

(ii)  $(xAy^T)^2 \le (xAx^T)(yAy^T)$, all $x, y \in R^n$.

(iii) $x \in R^n$, $xAx^T = 0 \Rightarrow xA^2x^T = 0$.

(iv)  $x \in R^n$, $xA^2x^T = 1 \Rightarrow (xAx^T)^2 > 0$.

(v)   $x \in R^n$, $xAx^T = 0 \Rightarrow xA = 0$.

PROOF: We show (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (v) $\Rightarrow$ (i).

(i) $\Rightarrow$ (ii).

Suppose $A$ is s.d., let $x, y \in R^n$. Consider the real quadratic
polynomial $p$ defined by:

$p(\lambda) = (x + \lambda y)A(x + \lambda y)^T = xAx^T + 2\lambda\, xAy^T + \lambda^2\, yAy^T.$

Since $A$ is s.d., $p$ does not change sign, i.e., its discriminant is
non-positive, whence:

$4(xAy^T)^2 - 4(xAx^T)(yAy^T) \le 0,$

giving the desired result.

(*)  $R^n = \{x \mid x = (x_1, \ldots, x_n)$ and $x_i$ is a real number for $i = 1, \ldots, n\}$.
(**) If $x \in R^n$, $x^T$ denotes the transpose of $x$.


(ii) $\Rightarrow$ (iii).

Suppose $x \in R^n$ and $xAx^T = 0$; then, from (ii), $(xAy^T)^2 \le 0$,
i.e., $xAy^T = 0$, for all $y \in R^n$. Thus $xA = 0$, and hence
$xA^2x^T = (xA)(xA)^T = 0$.

(iii) $\Rightarrow$ (iv).

If $x \in R^n$ and $(xAx^T)^2 \le 0$ then $xAx^T = 0$ and, by (iii),
$xA^2x^T = 0$, contradicting $xA^2x^T = 1$.

(iv) $\Rightarrow$ (v).

If $x \in R^n$ and $xAx^T = 0$ then, by (iv), $xA^2x^T \le 0$ (because if
$xA^2x^T > 0$ then we could normalize $x$ to get $xA^2x^T = 1$,
$xAx^T = 0$). However, $xA^2x^T = (xA)(xA)^T$, and thus $xA^2x^T \ge 0$
with equality holding if and only if $xA = 0$.


(v) $\Rightarrow$ (i).

Suppose (i) is false, i.e., there exist $x, y \in R^n$ such that
$xAx^T > 0$, $yAy^T < 0$. By suitable normalization [dividing $x$ by
$(xAx^T)^{1/2}$ and $y$ by $(-yAy^T)^{1/2}$], we may assume that
$xAx^T = 1$, $yAy^T = -1$. Now let:

(1)  $\lambda = -xAy^T + [1 + (xAy^T)^2]^{1/2}$

(2)  $z = \lambda x + y.$

We claim that $zA \ne 0$ and $zAz^T = 0$, thus contradicting (v). First,
if $zA = 0$ then multiplying (2) by $Ax^T$ and $Ay^T$ we get:

$0 = \lambda\, xAx^T + yAx^T = \lambda + xAy^T$

$0 = \lambda\, xAy^T + yAy^T = \lambda\, xAy^T - 1.$

Combining the last two equations:

$0 = \lambda\, xAy^T - 1 = (-xAy^T)(xAy^T) - 1 = -1 - (xAy^T)^2,$

a contradiction, thus $zA \ne 0$. However,

$zAz^T = (\lambda x + y)A(\lambda x + y)^T = \lambda^2\, xAx^T + 2\lambda\, xAy^T + yAy^T = \lambda^2 + 2\lambda\, xAy^T - 1,$

and $\lambda$ was chosen to be precisely one of the two (real) roots of
the preceding quadratic polynomial in $\lambda$. q.e.d.
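The construction in equations (1) and (2) can be tried numerically. What follows is a minimal sketch in plain Python, not part of the original note; the helper names `vec_mat`, `dot` and `isotropic_vector`, as well as the sample matrix and vectors, are illustrative assumptions. Vectors are treated as rows, as in the text.

```python
import math

def vec_mat(x, A):
    """Return the row vector x A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def dot(u, v):
    """Return the scalar u v^T."""
    return sum(a * b for a, b in zip(u, v))

def isotropic_vector(A, x, y):
    """Given x A x^T > 0 > y A y^T, build z = lambda*x + y as in (1)-(2)."""
    x = [c / math.sqrt(dot(vec_mat(x, A), x)) for c in x]    # now x A x^T = 1
    y = [c / math.sqrt(-dot(vec_mat(y, A), y)) for c in y]   # now y A y^T = -1
    b = dot(vec_mat(x, A), y)                                # x A y^T
    lam = -b + math.sqrt(1.0 + b * b)                        # equation (1)
    return [lam * xi + yi for xi, yi in zip(x, y)]           # equation (2)

# Assumed example: a symmetric matrix with characteristic values 3 and -1.
A = [[1.0, 2.0], [2.0, 1.0]]
z = isotropic_vector(A, [1.0, 0.0], [1.0, -1.0])
print(dot(vec_mat(z, A), z), vec_mat(z, A))
```

Up to rounding, the printed quadratic form vanishes while $zA$ does not, which is the violation of (v) used in the proof.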

Several comments are in order. Obviously, $A$ is s.d. if and only if
$A$ is p.s.d. or $-A$ is p.s.d. Condition (ii) of Theorem 1 is a
generalization of the Cauchy-Schwarz inequality, namely:

(3)  $(uv^T)^2 \le (uu^T)(vv^T)$, all $u, v \in R^n$,

for if we take $A$ to be the $n \times n$ identity matrix, which is
clearly p.s.d., we obtain (3) from (ii) - Theorem 1. Condition (v) -
Theorem 1, or its obvious equivalents (iii) and (iv), states that if we
consider $xA$, the image under the linear transformation $A$ of a point
$x$ in $R^n$, then $xA$ cannot be perpendicular to $x$ unless $x$ is in
the null-space of $A$. Alternately, (v) - Theorem 1 states that if $x$
is not in the null-space of $A$ then its image under $A$ cannot be
perpendicular to $x$.
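Condition (ii) lends itself to a quick numerical spot check. The sketch below is not from the original note: the function name `satisfies_ii`, the tolerance, the random sampling and the two sample matrices are all assumptions made for illustration. Random sampling can only refute (ii) for a given matrix; it can never prove it.

```python
import random

def vec_mat(x, A):
    """Return the row vector x A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def dot(u, v):
    """Return the scalar u v^T."""
    return sum(a * b for a, b in zip(u, v))

def satisfies_ii(A, trials=1000, tol=1e-9, seed=0):
    """Test (x A y^T)^2 <= (x A x^T)(y A y^T) on random pairs x, y."""
    rng = random.Random(seed)
    n = len(A)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        xAy = dot(vec_mat(x, A), y)
        xAx = dot(vec_mat(x, A), x)
        yAy = dot(vec_mat(y, A), y)
        if xAy ** 2 > xAx * yAy + tol:
            return False          # a counterexample to (ii) was found
    return True                   # no counterexample among the samples

psd_example = [[2.0, 1.0], [1.0, 2.0]]          # characteristic values 1 and 3
indefinite_example = [[1.0, 0.0], [0.0, -1.0]]  # characteristic values 1 and -1
print(satisfies_ii(psd_example), satisfies_ii(indefinite_example))
```

For the indefinite example the difference between the two sides of (ii) works out to $(x_1y_2 - x_2y_1)^2$ with the inequality reversed, so almost every sampled pair is a counterexample.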

We proceed next to obtain results which are, in a sense, "refinements"
of conditions (ii) (see Lemma 1 below) and (iv) (see Theorem 2) of
Theorem 1. Lemma 1 is a generalization of the well known fact associated
with the Cauchy-Schwarz inequality, stating that equality holds in (3)
if and only if $u$, $v$ are linearly dependent. We shall apply Lemma 1
in the proof of Theorem 3.
LEMMA 1

Let $A$ be s.d. If $x, y \in R^n$ then $(xAy^T)^2 = (xAx^T)(yAy^T)$ if
and only if $xA$, $yA$ are linearly dependent.

PROOF: If, say, $xA = \lambda\, yA$, where $\lambda$ is a real number,
then $xAy^T = \lambda\, yAy^T$ while
$xAx^T = \lambda\, yAx^T = \lambda\, xAy^T = \lambda^2\, yAy^T$. Whence
it follows that

$(xAy^T)^2 = \lambda^2 (yAy^T)^2 = (xAx^T)(yAy^T).$

On the other hand, suppose $(xAy^T)^2 = (xAx^T)(yAy^T)$. If $xAx^T = 0$
or $yAy^T = 0$ then, by (v) - Theorem 1, $xA = 0$ or $yA = 0$ and we
certainly can conclude that $xA$, $yA$ are linearly dependent.
Otherwise, say, $xAx^T > 0$ and $yAy^T > 0$, consequently
$xAy^T \ne 0$. Let $\rho = \mathrm{signum}(xAy^T)$ and let:

$\alpha = (yAy^T)^{1/2}$

$\beta = -\rho\,(xAx^T)^{1/2}$

then $\alpha, \beta \ne 0$ and:

$(\alpha x + \beta y)A(\alpha x + \beta y)^T = \alpha^2\, xAx^T + \beta^2\, yAy^T + 2\alpha\beta\, xAy^T$
$= 2(xAx^T)(yAy^T) - 2\rho\,(xAy^T)(xAx^T)^{1/2}(yAy^T)^{1/2}$
$= 2(xAx^T)(yAy^T) - 2\,|xAy^T|\,(xAx^T)^{1/2}(yAy^T)^{1/2}$
$= 2(xAx^T)(yAy^T) - 2(xAx^T)(yAy^T) = 0.$

Thus, $(\alpha x + \beta y)A(\alpha x + \beta y)^T = 0$ and, by (v) -
Theorem 1, $0 = (\alpha x + \beta y)A = \alpha\, xA + \beta\, yA$. q.e.d.
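The coefficients $\alpha$ and $\beta$ from the proof can be computed directly when equality holds in (ii). The following is a small sketch under stated assumptions: the function name `dependence_coefficients` and the rank-one sample matrix are hypothetical, and the code assumes $A$ is p.s.d. with $xAx^T > 0$ and $yAy^T > 0$.

```python
import math

def vec_mat(x, A):
    """Return the row vector x A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def dot(u, v):
    """Return the scalar u v^T."""
    return sum(a * b for a, b in zip(u, v))

def dependence_coefficients(A, x, y):
    """Return (alpha, beta) as defined in the proof of Lemma 1.

    Assumes x A x^T > 0, y A y^T > 0 and equality in condition (ii).
    """
    rho = math.copysign(1.0, dot(vec_mat(x, A), y))   # signum(x A y^T)
    alpha = math.sqrt(dot(vec_mat(y, A), y))
    beta = -rho * math.sqrt(dot(vec_mat(x, A), x))
    return alpha, beta

# Assumed example: a p.s.d. matrix of rank one, so x A and y A are always dependent.
A = [[1.0, 1.0], [1.0, 1.0]]
x, y = [1.0, 0.0], [0.0, 2.0]
alpha, beta = dependence_coefficients(A, x, y)
combination = [alpha * a + beta * b for a, b in zip(vec_mat(x, A), vec_mat(y, A))]
print(alpha, beta, combination)
```

Here $xAx^T = 1$, $yAy^T = 4$ and $xAy^T = 2$, so equality holds in (ii) and the combination $\alpha\, xA + \beta\, yA$ is the zero vector.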

The preceding lemma was motivated, in part, by an examination of
(ii) - Theorem 1 in case $A$ is the identity matrix. In that case (since
the square of the identity is the identity), (iv) - Theorem 1 states:
$x \in R^n$, $xx^T = 1$ implies $(xx^T)^2 > 0$, which is, of course,
true. We notice, though, that $(xAx^T)^2$ has then a positive lower
bound, namely 1. In general, this will be the case, i.e., a positive
lower bound will exist for $(xAx^T)^2$ in (iv) - Theorem 1, whenever
$A$ is s.d. Clearly, when $A$ is identically zero any positive number
will serve as a lower bound, because there is no $x \in R^n$ for which
$xA^2x^T = 1$; thus we will exclude $A = 0$ in the next theorem:
THEOREM 2

Suppose $A$ is p.s.d. and $A \ne 0$, then there exist a positive real
number $\mu$ and an $x_0 \in R^n$ such that:

(4)  $x \in R^n$, $xA^2x^T = 1 \Rightarrow xAx^T \ge \mu$

(5)  $x_0A^2x_0^T = 1$ and $x_0Ax_0^T = \mu$.
PROOF: Let

$X = \{x \mid x \in R^n$ and $xA^2x^T = 1\}$

$\mu = \inf_{x \in X} xAx^T.$

Since $A$ is p.s.d. and $A \ne 0$, $\mu$ is well defined and in fact
$\mu \ge 0$ and satisfies (4). By definition of $\mu$, there exists a
sequence $x_k$ such that

(6)  $x_k \in X$ for $k = 1, 2, \ldots$

(7)  $\{x_kAx_k^T\}$ converges to $\mu$.

We consider two cases:

Case 1. The sequence has a bounded subsequence. In this eventuality
the $x_k$ have a point of accumulation $x_0$, for which it must be true
(by (6) and (7), and because $X$ is closed) that $x_0 \in X$ and
$x_0Ax_0^T = \mu$. Thus $x_0$ satisfies (5). That $\mu$ is positive then
follows from (v) - Theorem 1. The two preceding facts, together with the
remark above that $\mu$ satisfies (4), complete the proof.
Case 2. The sequence has no bounded subsequence, i.e., we may assume
that $\lim |x_k| = \infty$ and $|x_k| > 0$, $k = 1, 2, \ldots$. We
define another sequence

$y_k = x_k / |x_k|.$

Now, $y_kAy_k^T$ converge to zero, because $x_kAx_k^T$ converge to
$\mu$, and also $y_kA^2y_k^T$ converge to zero, because
$x_kA^2x_k^T = 1$. However, $|y_k| = 1$, thus the $y_k$'s have an
accumulation point $y$, for which it must be true that $yAy^T = 0$.
Thus $yA = 0$ by (v) - Theorem 1.

Next we observe that from the definition of $y$ and the $y_k$'s it
follows that whenever $y$ has a non-zero component then infinitely many
$x_k$'s have the same component non-zero, and in fact of the same sign.
We may assume that an appropriate subsequence of $x_k$ has been selected
so that whenever $y$ has a positive (negative) component then all the
$x_k$ have the same component positive (negative). Now, if $\{\xi_k\}$
is any sequence of real numbers then:

$(x_k + \xi_k y)A(x_k + \xi_k y)^T = x_kAx_k^T$

and

$(x_k + \xi_k y)A^2(x_k + \xi_k y)^T = x_kA^2x_k^T$

because $yA = 0$. We can thus replace $x_k$ by $x_k + \xi_k y$,
$k = 1, 2, \ldots$, and (6) and (7) will still hold. However, by an
appropriate choice of $\xi_k$ we can reduce the number of non-zero
components in each of the $x_k$'s; eventually (repeating the above
process, if necessary) we obtain a sequence $\{x_k\}$ satisfying
(6)-(7) and which has an accumulation point, thus reducing it to
Case 1. q.e.d.
As an immediate consequence of Theorem 2 we can "strengthen"
(iv) - Theorem 1.

COROLLARY

If $A$ is s.d. and $A \ne 0$ then

$\mathrm{minimum}\ \{(xAx^T)^2 \mid x \in R^n$ and $xA^2x^T = 1\}$

exists and is positive.

PROOF: As noted before, if $A$ is s.d., then either $A$ is p.s.d. or
$-A$ is p.s.d.; in either case the square of the $\mu$ in Theorem 2 is
the required minimum and the $x_0$ of the same theorem is the required
minimizing $x$.

The $\mu$ and $x_0$ of Theorem 2 are, as one might expect, intimately
related to the characteristic values of $A$. This is brought forth in
the next theorem.


THEOREM 3

Let $A$ be p.s.d., $A \ne 0$. Let $\mu$ and $x_0$ be as in Theorem 2
and let $\lambda_n$ be the largest characteristic value of $A$, then
$\mu = \lambda_n^{-1}$ and $x_0A$ is a characteristic vector of $A$
corresponding to $\lambda_n$.

PROOF: Suppose $\lambda$ is any characteristic value of $A$, i.e.,
there exists an $x \in R^n$, $x \ne 0$, such that $xA = \lambda x$,
whence $xA^2x^T = \lambda\, xAx^T$. If $\lambda = 0$ then certainly
$\lambda \le \mu^{-1}$. Assuming $\lambda \ne 0$, it follows that
$xA \ne 0$ (because $x \ne 0$) and thus, by (v) - Theorem 1,
$xAx^T > 0$. Let $y = (xA^2x^T)^{-1/2}\, x$, then $yA^2y^T = 1$ and, by
definition of $\mu$, $yAy^T \ge \mu$. However
$yAy^T = (xA^2x^T)^{-1}(xAx^T) = \lambda^{-1}$, thus
$\lambda \le \mu^{-1}$. We have just demonstrated that
$\lambda \le \mu^{-1}$ for any characteristic value $\lambda$ of $A$,
thus $\lambda_n \le \mu^{-1}$.
To complete the proof of this theorem it will suffice to show that
there is a characteristic value $\lambda$ of $A$ such that
$\lambda = \mu^{-1}$ and $(x_0A)A = \lambda(x_0A)$, $x_0$ being as in
Theorem 2. Let $x = x_0$ be the minimizing vector in question. Since
$A$ and $A^2$ are p.s.d. (the square of any real symmetric matrix is
p.s.d.), and $xA \ne 0$ ($xAx^T = x_0Ax_0^T = \mu > 0$), it follows
that $xA^3x^T = (xA)A(xA)^T > 0$, and $xA^4x^T = (xA)A^2(xA)^T > 0$.
Thus, if we define

(8)  $\rho = 2(xA^3x^T)(xA^4x^T)^{-1}$

then $\rho$ is positive. Next let

(9)  $y = x - \rho\, xA.$

The motivation for the above definition of $y$ is as follows: we know
$x$ minimizes a certain function, namely $xAx^T$; since we wish to
derive from this fact some properties of $x$, we examine how $xAx^T$
will change in the direction of its gradient, namely $2xA$. As defined
in (9), $y$ is a translation from $x$ precisely in the direction of
that gradient; the particular value of $\rho$ chosen is designed to
keep $y$ within the "feasibility" set, i.e., $yA^2y^T = 1$.

We check next the last mentioned condition:

$yA^2y^T = (x - \rho\, xA)A^2(x - \rho\, xA)^T$
$= xA^2x^T - 2\rho\, xA^3x^T + \rho^2\, xA^4x^T$
$= 1 - 2\rho\,[xA^3x^T - \tfrac{\rho}{2}(xA^4x^T)]$
$= 1 - 2\rho\,[xA^3x^T - (xA^3x^T)(xA^4x^T)^{-1}(xA^4x^T)]$
$= 1.$

One can, incidentally, readily check that the particular value of
$\rho$, as given in (8), is the only value of $\rho$ (other than
$\rho = 0$) which yields $yA^2y^T = 1$. Now, since $yA^2y^T = 1$, we
must have, by definition of $\mu$,

(10)  $yAy^T - xAx^T \ge 0.$
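The step defined by (8) and (9) can be checked numerically for any feasible $x$, not only for the minimizer. Below is a minimal sketch, assuming a hypothetical helper `gradient_step` and an arbitrarily chosen 2 by 2 p.s.d. matrix; it only illustrates that the step preserves $xA^2x^T = 1$.

```python
import math

def vec_mat(x, A):
    """Return the row vector x A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def dot(u, v):
    """Return the scalar u v^T."""
    return sum(a * b for a, b in zip(u, v))

def gradient_step(A, x):
    """Apply (8) and (9): y = x - rho * xA with rho = 2 (x A^3 x^T) / (x A^4 x^T)."""
    xA = vec_mat(x, A)
    xA2 = vec_mat(xA, A)
    rho = 2.0 * dot(xA, xA2) / dot(xA2, xA2)   # x A^3 x^T = (xA)(xA^2)^T
    return [xi - rho * gi for xi, gi in zip(x, xA)]

# Assumed example: characteristic values 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
x = [1.0, 0.0]
scale = math.sqrt(dot(vec_mat(x, A), vec_mat(x, A)))   # sqrt(x A^2 x^T)
x = [c / scale for c in x]                             # now x A^2 x^T = 1
y = gradient_step(A, x)
feasibility = dot(vec_mat(y, A), vec_mat(y, A))        # y A^2 y^T
print(feasibility, dot(vec_mat(x, A), x), dot(vec_mat(y, A), y))
```

For this starting vector, which is not the minimizer, the step keeps $yA^2y^T$ at 1 up to rounding and lowers the quadratic form, consistent with the sign analysis that follows (10).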

However,

$yAy^T - xAx^T = (x - \rho\, xA)A(x - \rho\, xA)^T - xAx^T$
$= -2\rho\, xA^2x^T + \rho^2\, xA^3x^T$
$= 2\rho\,[\tfrac{\rho}{2}(xA^3x^T) - (xA^2x^T)]$
$= 2\rho\,(xA^4x^T)^{-1}[(xA^3x^T)^2 - (xA^2x^T)(xA^4x^T)].$

Thus, since $\rho > 0$, $(xA^4x^T)^{-1} > 0$ and because (10) holds, we
have:

(11)  $(xA^3x^T)^2 \ge (xA^2x^T)(xA^4x^T).$

We now refer to inequality (3), which is a special case of (ii) -
Theorem 1 with $A$ being the identity; letting $u = xA$, $v = xA^2$ we
get:

(12)  $(xA^3x^T)^2 \le (xA^2x^T)(xA^4x^T).$

Combining (11) and (12), we get:

(13)  $(xA^3x^T)^2 = (xA^2x^T)(xA^4x^T).$

However, from Lemma 1, again with $A$ being the identity matrix, we
then know that $xA$, $xA^2$ are linearly dependent. Since $xA \ne 0$,
it follows that there is a real number $\lambda$ such that
$xA^2 = \lambda\, xA$; multiplying by $x^T$:
$1 = xA^2x^T = \lambda\, xAx^T$, and
$\lambda = (xAx^T)^{-1} = \mu^{-1}$. q.e.d.
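Theorem 3 can be illustrated on a small example by brute force. The sketch below is an assumption-laden illustration rather than anything from the note: it handles only the 2 by 2 case, scans directions on a grid, and uses the hypothetical name `minimize_on_feasible_set`. Each direction $d$ is rescaled so that $xA^2x^T = 1$ before $xAx^T$ is evaluated.

```python
import math

def vec_mat(x, A):
    """Return the row vector x A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def dot(u, v):
    """Return the scalar u v^T."""
    return sum(a * b for a, b in zip(u, v))

def minimize_on_feasible_set(A, samples=1000):
    """Grid search (2x2 only) for mu = min x A x^T subject to x A^2 x^T = 1."""
    best_mu, best_x = None, None
    for k in range(samples):
        t = math.pi * k / samples
        d = [math.cos(t), math.sin(t)]
        dA = vec_mat(d, A)
        scale = math.sqrt(dot(dA, dA))        # sqrt(d A^2 d^T)
        if scale == 0.0:
            continue                          # d lies in the null-space of A
        x = [c / scale for c in d]            # now x A^2 x^T = 1
        value = dot(vec_mat(x, A), x)
        if best_mu is None or value < best_mu:
            best_mu, best_x = value, x
    return best_mu, best_x

# Assumed example: characteristic values 1 and 3, so Theorem 3 predicts mu = 1/3.
A = [[2.0, 1.0], [1.0, 2.0]]
mu, x0 = minimize_on_feasible_set(A)
x0A = vec_mat(x0, A)
print(mu, x0A, vec_mat(x0A, A))
```

In this example the grid contains the direction $(1, 1)$, so the search recovers $\mu = 1/3$ and $x_0A$ is mapped by $A$ to three times itself, up to rounding.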

As a final general result, we specialize (ii) - Theorem 1, and Lemma 1,
for the case where $A$ is non-singular.

THEOREM 4

Let $A$ be p.s.d. and non-singular, then

(14)  $(uv^T)^2 \le (uAu^T)(vA^{-1}v^T)$, all $u, v \in R^n$

and equality holds above if and only if $u$, $vA^{-1}$ are dependent.

PROOF: We first note that $A^{-1}$ must be symmetric because
$AA^{-1} = I$, thus
$I = I^T = (AA^{-1})^T = (A^{-1})^TA^T = (A^{-1})^TA$. But the inverse
is unique, thus $A^{-1} = (A^{-1})^T$. Next, let $u, v \in R^n$; we let

(15)  $x = u$, $y = vA^{-1}$.

One readily checks that:

$xAy^T = uv^T$, $xAx^T = uAu^T$, $yAy^T = vA^{-1}v^T$.

Thus the desired inequality (14) follows from (ii) - Theorem 1. Now if
(14) is actually an equation, then from Lemma 1, using $x$, $y$ as
defined in (15), $xA = uA$ and $yA = v$ are linearly dependent; since
$A$ is non-singular, this is the same as saying that $u$, $vA^{-1}$
are linearly dependent. The converse also follows readily. q.e.d.
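Inequality (14) and its equality case can be spot-checked on a 2 by 2 example. This is a sketch only: the helpers `inverse_2x2` and `theorem4_gap`, the sample matrix and the sample vectors are assumptions introduced for illustration.

```python
def vec_mat(x, A):
    """Return the row vector x A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def dot(u, v):
    """Return the scalar u v^T."""
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(A):
    """Invert a non-singular 2x2 matrix by the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def theorem4_gap(A, u, v):
    """Return (u A u^T)(v A^{-1} v^T) - (u v^T)^2, which (14) says is >= 0."""
    A_inv = inverse_2x2(A)
    return dot(vec_mat(u, A), u) * dot(vec_mat(v, A_inv), v) - dot(u, v) ** 2

# Assumed example: p.s.d. and non-singular (characteristic values 1 and 3).
A = [[2.0, 1.0], [1.0, 2.0]]
u = [1.0, 2.0]
strict_gap = theorem4_gap(A, u, [3.0, -1.0])        # u and vA^{-1} independent
equality_gap = theorem4_gap(A, u, vec_mat(u, A))    # v = uA, so vA^{-1} = u
print(strict_gap, equality_gap)
```

Choosing $v = uA$ makes $vA^{-1} = u$, which is the dependent case of the theorem, and the gap vanishes up to rounding.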

Note: The condition of equality in (14) is directly connected with
characteristic vectors of $A$ (and of course, those of $A^{-1}$), for
suppose (14) is an equation and $u = v \ne 0$, then one sees immediately
that $uA = \lambda u$ for some real number $\lambda$. The corresponding
converse also holds in this case.

An iterative scheme, for deciding the definiteness of $A$, based on the
proof of Theorem 3 might go as follows:

(a) By examining the diagonal elements of $A$ we have decided that, if
at all, $A$ is p.s.d.

(b) We have an $x$ such that $xA \ne 0$; if $xAx^T \le 0$ then $A$ is
not p.s.d.; if $xAx^T > 0$ normalize $x$ so that $xA^2x^T = 1$ and
proceed to (c).

(c) We have an $x$ such that $xA \ne 0$, $xA^2x^T = 1$; perform the
transformation given by (8) and (9). There are three cases:

Case 1. If $yAy^T > xAx^T$ then $A$ is not p.s.d.

Case 2. If $yAy^T < xAx^T$ return to the beginning of (c), using $y$
as the new "test" vector.

Case 3. If $yAy^T = xAx^T$ we have isolated a characteristic vector
of $A$; return to (b) using, as $x$, a vector independent of all
characteristic vectors thus far obtained.

The preceding is, of course, "informal" in the sense that the iterative
procedure described above has not been shown to converge.
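Steps (b) and (c) can be sketched in plain Python for a single starting vector. Several choices below are assumptions not found in the note: the function name `iterate_scheme`, the floating-point tolerance that stands in for exact equality in Case 3, the iteration cap, and an extra sign test on $yAy^T$ that re-applies the check of step (b) to each new test vector. Step (a) and the restart with an independent vector after Case 3 are left out.

```python
import math

def vec_mat(x, A):
    """Return the row vector x A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def dot(u, v):
    """Return the scalar u v^T."""
    return sum(a * b for a, b in zip(u, v))

def iterate_scheme(A, x, max_iter=1000, tol=1e-12):
    """Run steps (b)-(c) from one starting vector x with x A != 0.

    Returns ("not p.s.d.", None), ("characteristic vector", value) where value
    estimates the characteristic value belonging to x A, or ("undecided", None).
    """
    xA = vec_mat(x, A)
    if dot(xA, x) <= 0.0:                      # step (b): x A x^T <= 0
        return "not p.s.d.", None
    scale = math.sqrt(dot(xA, xA))             # sqrt(x A^2 x^T)
    x = [c / scale for c in x]                 # now x A^2 x^T = 1
    for _ in range(max_iter):                  # step (c)
        xA = vec_mat(x, A)
        xA2 = vec_mat(xA, A)
        rho = 2.0 * dot(xA, xA2) / dot(xA2, xA2)        # equation (8)
        y = [xi - rho * gi for xi, gi in zip(x, xA)]    # equation (9)
        q_x = dot(xA, x)
        q_y = dot(vec_mat(y, A), y)
        if q_y > q_x + tol or q_y <= 0.0:      # Case 1, plus assumed sign re-check
            return "not p.s.d.", None
        if q_x - q_y <= tol:                   # Case 3, equality up to tol
            return "characteristic vector", 1.0 / q_x
        x = y                                  # Case 2: y is the new test vector
    return "undecided", None                   # convergence is not guaranteed

psd_result = iterate_scheme([[2.0, 1.0], [1.0, 2.0]], [1.0, 0.0])
indefinite_result = iterate_scheme([[1.0, 0.0], [0.0, -2.0]], [1.0, 0.1])
print(psd_result, indefinite_result)
```

In the first example the iteration settles on the characteristic value 3; in the second the test vector drifts toward the negative direction and the sign test fires. As the text stresses, nothing here establishes convergence in general, which is why the sketch can also return "undecided".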